Real-time Convolutional Neural Networks for Emotion and Gender Classification
In this paper we propose and implement a general convolutional neural network
(CNN) building framework for designing real-time CNNs. We validate our models
by creating a real-time vision system which accomplishes the tasks of face
detection, gender classification and emotion classification simultaneously in
one blended step using our proposed CNN architecture. After presenting the
details of the training procedure setup we proceed to evaluate on standard
benchmark sets. We report accuracies of 96% on the IMDB gender dataset and 66%
on the FER-2013 emotion dataset. In addition, we integrate the recently
introduced real-time guided back-propagation visualization technique.
Guided back-propagation uncovers the dynamics of the weight changes and
evaluates the learned features. We argue that the careful implementation of
modern CNN architectures, the use of current regularization methods, and the
visualization of previously hidden features are necessary to reduce
the gap between slow models and real-time architectures. Our system has
been validated by its deployment on a Care-O-bot 3 robot used during
RoboCup@Home competitions. All our code, demos and pre-trained architectures
have been released under an open-source license in our public repository.
Comment: Submitted to ICRA 201
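At its core, the guided back-propagation technique mentioned in the abstract modifies the backward pass at each ReLU unit so that a gradient only flows where both the forward pre-activation and the incoming gradient are positive. The following is a minimal pure-Python sketch of that rule (an illustrative toy, not the paper's released implementation):

```python
def guided_backprop_relu(upstream_grad, pre_activation):
    """Guided back-propagation rule for one ReLU layer: pass a
    gradient back only where BOTH the forward pre-activation and
    the incoming gradient are positive; zero it out elsewhere."""
    return [g if (x > 0 and g > 0) else 0.0
            for g, x in zip(upstream_grad, pre_activation)]

# Toy example with four ReLU units: only unit 0 has a positive
# pre-activation AND a positive incoming gradient.
pre_act = [1.5, -2.0, 0.5, -0.1]   # forward pre-activations
grad    = [0.3,  0.7, -0.2, 0.4]   # gradient arriving from above
print(guided_backprop_relu(grad, pre_act))  # → [0.3, 0.0, 0.0, 0.0]
```

In a real CNN this rule is applied at every ReLU during the backward pass, which is what produces the sharp feature visualizations the abstract refers to.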
Image Captioning and Classification of Dangerous Situations
Current robot platforms are being employed to collaborate with humans in a
wide range of domestic and industrial tasks. These environments require
autonomous systems that are able to classify and communicate anomalous
situations such as fires, injured persons, and car accidents, or, more generally, any
potentially dangerous situation for humans. In this paper we introduce an
anomaly detection dataset for the purpose of robot applications as well as the
design and implementation of a deep learning architecture that classifies and
describes dangerous situations using only a single image as input. We report a
classification accuracy of 97% and a METEOR score of 16.2. We will make the
dataset publicly available after this paper is accepted.
Difficulty Estimation With Action Scores for Computer Vision Tasks
As more machine learning models are applied in real-world scenarios, it has become crucial to evaluate their difficulties and biases. In this paper we present an unsupervised method for calculating a difficulty score based on the accumulated loss per epoch. Our proposed method requires neither any modification to the model nor any external supervision, and it can be easily applied to a wide range of machine learning tasks. We provide results for the tasks of image classification, image segmentation, and object detection. We compare our score against similar metrics and provide theoretical and empirical evidence of their difference. Furthermore, we show applications of our proposed score for detecting incorrect labels and testing for possible biases.
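The core idea of a difficulty score based on accumulated loss per epoch can be sketched in a few lines: track each sample's training loss across epochs and sum it, so samples that never converge score as hard. This is a toy illustration of the idea only; the paper's actual action score may aggregate the per-epoch losses differently:

```python
def difficulty_scores(loss_history):
    """Per-sample difficulty as loss accumulated across epochs.
    loss_history[e][i] is the training loss of sample i at epoch e.
    A sample whose loss stays high throughout training scores as
    'hard'; no labels or model changes are needed."""
    n_samples = len(loss_history[0])
    return [sum(epoch[i] for epoch in loss_history)
            for i in range(n_samples)]

# Three samples tracked over three epochs; sample 2 never converges.
history = [
    [0.9, 0.8, 1.2],   # epoch 0
    [0.4, 0.3, 1.1],   # epoch 1
    [0.1, 0.1, 1.0],   # epoch 2
]
scores = difficulty_scores(history)
hardest = max(range(len(scores)), key=scores.__getitem__)
print(hardest)  # → 2 (the sample with the highest accumulated loss)
```

Sorting samples by this score directly supports the applications the abstract mentions: the highest-scoring samples are natural candidates for mislabeled data or under-represented groups.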
DExT: Detector Explanation Toolkit
State-of-the-art object detectors are treated as black boxes due to their
highly non-linear internal computations. Even with unprecedented advancements
in detector performance, the inability to explain how their outputs are
generated limits their use in safety-critical applications. Previous work fails
to produce explanations for both bounding box and classification decisions, and
generally makes individual explanations for various detectors. In this paper, we
propose an open-source Detector Explanation Toolkit (DExT) which implements the
proposed approach to generate a holistic explanation for all detector decisions
using certain gradient-based explanation methods. We suggest various
multi-object visualization methods to merge the explanations of multiple
objects detected in an image as well as the corresponding detections in a
single image. The quantitative evaluation shows that the Single Shot MultiBox
Detector (SSD) is explained more faithfully than other detectors,
regardless of the explanation method. Both quantitative and human-centric
evaluations identify that SmoothGrad with Guided Backpropagation (GBP) provides
more trustworthy explanations among selected methods across all detectors. We
expect that DExT will motivate practitioners to evaluate object detectors from
the interpretability perspective by explaining both bounding box and
classification decisions.
Comment: 24 pages, with appendix. 1st World Conference on eXplainable Artificial Intelligence camera-ready
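SmoothGrad, which the evaluations above pair with Guided Backpropagation, averages a saliency map over many noisy copies of the input to suppress gradient noise. A minimal sketch of that averaging step, using a stand-in gradient function in place of a real detector's saliency method (names and defaults here are illustrative, not DExT's API):

```python
import random

def smoothgrad(grad_fn, x, n_samples=50, sigma=0.1, seed=0):
    """SmoothGrad: average the saliency (gradient) over n_samples
    copies of the input x perturbed with Gaussian noise of scale
    sigma. grad_fn stands in for any saliency method, e.g. Guided
    Backpropagation as used in DExT."""
    rng = random.Random(seed)
    acc = [0.0] * len(x)
    for _ in range(n_samples):
        noisy = [xi + rng.gauss(0.0, sigma) for xi in x]
        acc = [a + g for a, g in zip(acc, grad_fn(noisy))]
    return [a / n_samples for a in acc]

# Toy "model" f(x) = sum(x_i^2): its true gradient is 2*x, so the
# smoothed saliency should land close to [2.0, -1.0].
smoothed = smoothgrad(lambda x: [2.0 * xi for xi in x], [1.0, -0.5])
```

For a detector, `grad_fn` would return the gradient of a chosen box or class score with respect to the image, and the averaged map is what gets visualized.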
Sanity Checks for Saliency Methods Explaining Object Detectors
Saliency methods are frequently used to explain Deep Neural Network-based models. Adebayo et al.'s work on evaluating saliency methods for classification models illustrates that certain explanation methods fail the model and data randomization tests. However, by extending the tests to various state-of-the-art object detectors, we illustrate that the ability to explain a model depends more on the model itself than on the explanation method. We perform sanity checks for object detection and define new qualitative criteria to evaluate the saliency explanations, both for object classification and bounding box decisions, using Guided Backpropagation, Integrated Gradients, and their SmoothGrad versions, together with Faster R-CNN, SSD, and EfficientDet-D0, trained on COCO. In addition, the sensitivity of the explanation method to model parameters and data labels varies class-wise, motivating us to perform the sanity checks for each class. We find that EfficientDet-D0 is the most interpretable detector, independent of the saliency method, and that it passes the sanity checks with few problems.
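The model randomization test referenced above has a simple shape: if a saliency map barely changes when the model's weights are randomized, the explanation is not actually sensitive to the model and the method fails the check. A toy sketch of that comparison, where the similarity metric and threshold are illustrative choices rather than the paper's exact setup:

```python
def pearson(a, b):
    """Pearson correlation between two flattened saliency maps."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a) ** 0.5
    vb = sum((y - mb) ** 2 for y in b) ** 0.5
    return cov / (va * vb)

def passes_randomization_test(sal_trained, sal_randomized, threshold=0.5):
    """Model-randomization sanity check (after Adebayo et al.): the
    saliency map from the trained model should look different from
    the one computed after randomizing the weights. If the two maps
    stay highly correlated, the explanation method fails the check."""
    return abs(pearson(sal_trained, sal_randomized)) < threshold

sal = [0.9, 0.1, 0.8, 0.2]           # saliency from the trained model
sal_rand_ok  = [0.1, 0.9, 0.8, 0.2]  # map changed a lot -> passes
sal_rand_bad = [0.9, 0.1, 0.8, 0.2]  # map identical -> fails
print(passes_randomization_test(sal, sal_rand_ok))   # True
print(passes_randomization_test(sal, sal_rand_bad))  # False
```

Running this check per class, as the abstract motivates, just means computing `sal_trained` and `sal_randomized` separately for each object class's score before comparing them.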